Generalized Min-Max Kernel and Generalized Consistent Weighted Sampling
Abstract
We propose the “generalized min-max” (GMM) kernel as a measure of data similarity for data vectors that can have both positive and negative entries. GMM is positive definite because there is an associated hashing method, “generalized consistent weighted sampling” (GCWS), which linearizes this nonlinear kernel. A natural competitor of GMM is the radial basis function (RBF) kernel, whose corresponding hashing method is known as “random Fourier features” (RFF). An extensive experimental study of classification on 50 publicly available datasets demonstrates that both the GMM and RBF kernels can often substantially improve over linear classifiers. Furthermore, GCWS typically requires substantially fewer samples than RFF to achieve similar classification accuracy. To understand the properties of RFF, we derive its theoretical variance, which contains a term that does not vanish at any similarity. In comparison, the variance of GCWS approaches zero at certain similarities. Overall, the variance of RFF relative to its expectation is substantially larger than that of GCWS, which helps explain the superior empirical results of GCWS compared to RFF. We expect that GMM and GCWS will be adopted in practice for large-scale statistical machine learning applications and efficient near neighbor search (as GCWS generates discrete hash values).
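The abstract names the kernel and its hashing method without spelling either out. The following is a minimal sketch, assuming the usual construction: each vector is split into its positive and negative parts (giving a non-negative vector of twice the length), the kernel is the sum of coordinate-wise minima divided by the sum of coordinate-wise maxima, and hashing follows the standard consistent weighted sampling recipe with Gamma(2, 1) and Uniform(0, 1) variates. The function names `expand`, `gmm_kernel`, and `gcws_hashes` are hypothetical and chosen only for illustration; the parts are concatenated rather than interleaved, which permutes coordinates but does not change the kernel value.

```python
import numpy as np

def expand(x):
    """Split a real vector into non-negative parts: [max(x, 0), max(-x, 0)]."""
    x = np.asarray(x, dtype=float)
    return np.concatenate([np.maximum(x, 0.0), np.maximum(-x, 0.0)])

def gmm_kernel(u, v):
    """Sum of coordinate-wise minima over sum of maxima on the expanded vectors."""
    a, b = expand(u), expand(v)
    denom = np.maximum(a, b).sum()
    return np.minimum(a, b).sum() / denom if denom > 0 else 0.0

def gcws_hashes(x, k, seed=0):
    """Draw k consistent weighted samples (i*, t*) for a non-zero vector x.

    Assumption: all vectors being compared use the same seed and dimension,
    so coordinate i sees the same random variates in every vector.
    """
    w = expand(x)
    rng = np.random.default_rng(seed)
    shape = (k, w.size)
    r = rng.gamma(2.0, 1.0, shape)
    c = rng.gamma(2.0, 1.0, shape)
    beta = rng.uniform(0.0, 1.0, shape)
    nz = w > 0                                   # only positive weights can be sampled
    t = np.floor(np.log(w[nz]) / r[:, nz] + beta[:, nz])
    a = np.log(c[:, nz]) - r[:, nz] * (t + 1.0 - beta[:, nz])
    j = np.argmin(a, axis=1)                     # winning coordinate per sample
    idx = np.flatnonzero(nz)[j]                  # map back to expanded indices
    return idx, t[np.arange(k), j].astype(int)

u, v = [1.0, -2.0, 0.0], [2.0, -1.0, 3.0]
iu, tu = gcws_hashes(u, k=4000)
iv, tv = gcws_hashes(v, k=4000)
estimate = np.mean((iu == iv) & (tu == tv))      # fraction of colliding hashes
exact = gmm_kernel(u, v)                         # 2/7 for this pair
```

Under these assumptions, the fraction of colliding hash pairs estimates the kernel value, which is what makes a linearized (hashed) representation possible. Sharing the seed across vectors is the design choice that makes the samples "consistent".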
Similar resources
Generalized Intersection Kernel
Following the very recent line of work on the “generalized min-max” (GMM) kernel [7], this study proposes the “generalized intersection” (GInt) kernel and the related “normalized generalized min-max” (NGMM) kernel. In computer vision, the (histogram) intersection kernel has been popular, and the GInt kernel generalizes it to data which can have both negative and positive entries. Through an ext...
Nystrom Method for Approximating the GMM Kernel
The GMM (generalized min-max) kernel was recently proposed [5] as a measure of data similarity and was demonstrated to be effective in machine learning tasks. In order to use the GMM kernel for large-scale datasets, the prior work resorted to (generalized) consistent weighted sampling (GCWS) to convert the GMM kernel to a linear kernel. We call this approach “GMM-GCWS”. In the machine learning l...
Generalized weighted fairness criterion: formulation and application on prioritized ABR service
In this paper, a generalized fairness criterion referred to as the Generalized Weighted Fairness Criterion (GWFC) for flow control on ABR service is presented. Within the GWFC framework, a weight is assigned to each ABR connection and bandwidth is allocated to it in proportion to the corresponding weight. The GWFC can generalize fairness sub-criteria for prioritized services, and in particular,...
Generalized skew bisubmodularity: A characterization and a min-max theorem
Huber, Krokhin, and Powell (2013) introduced a concept of skew bisubmodularity, as a generalization of bisubmodularity, in their complexity dichotomy theorem for valued constraint satisfaction problems over the three-value domain. In this paper we consider a natural generalization of the concept of skew bisubmodularity and show a connection between the generalized skew bisubmodularity and a con...
Generalized skew bisubmodularity: A characterization and a min-max theorem
Huber, Krokhin, and Powell (Proc. SODA2013) introduced a concept of skew bisubmodularity, as a generalization of bisubmodularity, in their complexity dichotomy theorem for valued constraint satisfaction problems over the three-value domain. In this paper we consider a natural generalization of the concept of skew bisubmodularity and show a connection between the generalized skew bisubmodularity...
Journal:
- CoRR
Volume: abs/1605.05721
Publication date: 2016